Optimizing machine learning inference queries with correlative proxy models

Abstract

We consider accelerating machine learning (ML) inference queries on unstructured datasets. Expensive operators such as feature extractors and classifiers are deployed as user-defined functions (UDFs), which are not penetrable with classic query optimization techniques such as predicate push-down. Recent optimization schemes (e.g., Probabilistic Predicates or PP) assume independence among the query predicates, build a proxy model for each predicate offline, and rewrite a new query by injecting these cheap proxy models in front of the expensive ML UDFs. In this manner, unlikely inputs that do not satisfy the query predicates are filtered early to bypass the ML UDFs. We show that enforcing the independence assumption in this context may result in sub-optimal plans. In this paper, we propose CORE, a query optimizer that better exploits the predicate correlations and accelerates ML inference queries. Our solution builds the proxy models online for a new query and leverages a branch-and-bound search process to reduce the building costs. Results on three real-world text, image and video datasets show that CORE improves the query throughput by up to 63% compared to PP and by up to 80% compared to running the queries as is.
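The filtering idea described in the abstract can be illustrated with a minimal Python sketch. This is not the implementation of CORE or PP: the names `ExpensiveUDF`, `proxy_score`, `run_query` and the threshold value are hypothetical, the "expensive" classifier is simulated by a keyword check with a call counter, and the proxy is a hand-written heuristic rather than a trained model.

```python
from typing import Callable, Iterable, List


class ExpensiveUDF:
    """Hypothetical stand-in for a costly ML UDF (e.g., a deep classifier).

    It counts its own invocations so the effect of early filtering is visible.
    """

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> bool:
        self.calls += 1
        lowered = text.lower()
        return "refund" in lowered and "angry" in lowered


def proxy_score(text: str) -> float:
    """Cheap proxy (assumption: a keyword heuristic, not a trained model).

    Returns a score in [0, 1] estimating how likely the input is to
    satisfy the expensive predicate.
    """
    lowered = text.lower()
    hits = sum(word in lowered for word in ("refund", "angry"))
    return hits / 2.0


def run_query(
    rows: Iterable[str],
    udf: Callable[[str], bool],
    proxy: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Apply the proxy first; call the UDF only on rows the proxy lets through."""
    results = []
    for row in rows:
        if proxy(row) < threshold:
            continue  # unlikely input: skip the expensive UDF entirely
        if udf(row):
            results.append(row)
    return results


ROWS = [
    "Angry customer asks for a refund",
    "Great product, thanks",
    "Where is my refund?",
    "Shipping was fast",
]

# Baseline: run the expensive UDF on every row.
baseline_udf = ExpensiveUDF()
baseline = [row for row in ROWS if baseline_udf(row)]

# Rewritten plan: the proxy filter sits in front of the UDF.
filtered_udf = ExpensiveUDF()
filtered = run_query(ROWS, filtered_udf, proxy_score, threshold=0.5)

print(baseline_udf.calls, filtered_udf.calls)
```

In this toy run both plans return the same rows, while the filtered plan invokes the UDF on fewer inputs. In general a proxy can wrongly discard inputs that would have satisfied the predicate, so the threshold trades accuracy against throughput.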

Similar resources

Optimizing PID parameters with machine learning

This paper examines the Evolutionary programming (EP) method for optimizing PID parameters. PID is the most common type of regulator within control theory, partly because it’s relatively simple and yields stable results for most applications. The p, i and d parameters vary for each application; therefore, choosing the right parameters is crucial for obtaining good results but also somewhat diff...

Dual Inference for Machine Learning

Recent years have witnessed the rapid development of machine learning in solving artificial intelligence (AI) tasks in many domains, including translation, speech, image, etc. Within these domains, AI tasks are usually not independent. As a specific type of relationship, structural duality does exist between many pairs of AI tasks, such as translation from one language to another vs. its opposi...

Dust source mapping using satellite imagery and machine learning models

Predicting dust source areas and determining the factors that affect them is necessary in order to prioritize management practices that deal with desertification due to wind erosion in arid areas. Therefore, this study aimed to evaluate the application of three machine learning models (generalized linear model, artificial neural network, random forest) to predict the vulnerability of dust cent...

Machine Learning Models for Housing Prices Forecasting using Registration Data

This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...

Optimizing Queries with Materialized Views

While much work has addressed the problem of maintaining materialized views, the important question of optimizing queries in the presence of materialized views has not been resolved. In this paper, we analyze the optimization question and provide a comprehensive and efficient solution. Our solution has the desirable property that it is a simple generalization of the traditional query optimization...

Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2022

ISSN: 2150-8097

DOI: https://doi.org/10.14778/3547305.3547310